Skip to content

[Improv]: enable max ordinal check after write state engine prepare write;#787

Merged
azeng-netflix merged 1 commit intomasterfrom
azeng-max-ordinal-check
Feb 5, 2026
Merged

[Improv]: enable max ordinal check after write state engine prepare write;#787
azeng-netflix merged 1 commit intomasterfrom
azeng-max-ordinal-check

Conversation

@azeng-netflix
Copy link
Collaborator

@azeng-netflix azeng-netflix commented Jan 21, 2026

Summary:
This change adds configurable soft limit validation to ByteArrayOrdinalMap and exposes ordinal map
statistics to PublishListeners for monitoring.

Changes:

  1. Soft Limit Enforcement
    - Adds soft limit threshold at 268,435,456 (2^28) - half of the hard limit
    - Throws exception when soft limit exceeded (configurable behavior)
    - Override mechanism via HollowProducer.withIgnoreSoftLimits(Supplier)
    - One-time logging per cycle to prevent log spam
  2. Stats Exposure
    - New ByteArrayOrdinalMapStats class captures maxOrdinal, byteDataLength, and loadFactor
    - New PublishStageStats wraps stats for all type write states
    - Extended PublishListener.onPublishComplete() to receive stats (backward compatible)
    - Stats collected after successful publish and passed through listener chain
    3. Perf Improvement
    - Remove redundant scan of pointersAndOrdinals to find the max ordinal in HollowWriteTypeState. Instead, find and return the max ordinal when doing prepareForWrite

Backward Compatibility:

  • Default constructor maintains existing behavior (soft limits disabled by default)
  • Deprecated onPublishComplete method preserved with default implementation
  • No breaking changes to existing API

@azeng-netflix azeng-netflix force-pushed the azeng-max-ordinal-check branch from 7745b1a to 53aef77 Compare January 23, 2026 22:02
* @param ignoreOrdinalThresholdBreach the supplier that returns true to ignore threshold breaches
* @return this builder
*/
public B withIgnoreOrdinalThresholdBreachConfig(Supplier<Boolean> ignoreOrdinalThresholdBreach) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can make this more generic to all soft limits- ignoreSoftLimits

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private static final long ORDINAL_MASK = (1L << BITS_PER_ORDINAL) - 1;
private static final long MAX_BYTE_DATA_LENGTH = 1L << BITS_PER_POINTER;
// a safe limit for the ordinal. Even though the ordinal value range is [0, (1<<29) - 1], in production,
// the user should be warned whenever the ordinal value exceeds (1<<28) - 1.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: drop "in production" and replace "the user should be warned" with something like- fail the producer cycle unless soft limits are overridden

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


if (ordinal >= SOFT_ORDINAL_LIMIT) {
if (logOrdinalThresholBreach && ignoreOrdinalThresholdBreach) {
logOrdinalThresholBreach = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo in logOrdinalThresholBreach - but I guess it will be renamed to logSoftLimitBreach anyway

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*/
public void prepareForNextCycle() {
ordinalMap.compact(currentCyclePopulated, numShards, stateEngine.isFocusHoleFillInFewestShards());
ordinalMap.setIgnoreOrdinalThresholdBreach(ignoreOrdinalThresholdBreachSupplier == null ? false : ignoreOrdinalThresholdBreachSupplier.get());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here and anywhere else where there isn't a way to override the soft limit failure, lets default to not failing on soft limits

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*/
void onBlobPublish(Status status, HollowProducer.Blob blob, Duration elapsed);

default void onPublishHeaderBlob(HollowWriteStateEngine writeStateEngine) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleanup?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used for listener to do customized report

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coupling with onPublishHeaderBlob doesn't feel great, can we explore reusing the onPopulateComplete listener method?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved it to onPublishComplete.
The reason of not doing it in onPopulateComplete, is that the WriteStateEngine has not been prepared for write yet.

@azeng-netflix azeng-netflix force-pushed the azeng-max-ordinal-check branch 3 times, most recently from 308129f to a80d8cc Compare January 30, 2026 01:00
@azeng-netflix azeng-netflix marked this pull request as ready for review January 30, 2026 17:54
@azeng-netflix
Copy link
Collaborator Author

@azeng-netflix Before merge, check again to see if missing any places where the limit supplier are not properly propagated/copied over.

}

// check if ByteArrayOrdinalMap.ignoreSoftLimits has been updated to the expected value.
private void checkBaomIgnoreOrdinalSoftLimit(boolean expected, HollowProducer producer) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: Modified this test file to check if ignore ordinal limit can be updated between cycles/restore

*
* @return a {@link ByteArrayOrdinalMapStats} containing map statistics
*/
public ByteArrayOrdinalMapStats getOrdinalMapStatsAfterPreparation() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could simplify name to getStats , the clarification you have in javadoc feels sufficient

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to getOrdinalMapStats in case in the future other stats also need to be exposed from HollowTypeWriteState

//// adjust number of shards per type during the course of the delta chain to realize consumer-side delta applications at constant space overhead
private boolean allowTypeResharding = false;
//// supplier to determine whether to ignore ordinal threshold breaches (log warning instead of throwing exception)
private Supplier<Boolean> ignoreOrdinalSoftLimitsSupplier;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: name this more generic ignoreSoftLimitsSupplier, becuase we would reuse it for any other limits. Could also drop supplier from the name

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to ignoreSoftLimits

}
// continue silently after first log
} else {
String message = String.format("Ordinal %d exceeds the soft ordinal limit of %d.",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: adding a hint in the log msg about support for ignoring soft limits could be useful

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this(schema, -1);
}

public HollowListTypeWriteState(HollowListSchema schema, Supplier<Boolean> ignoreOrdinalSoftLimitsSupplier) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think we really need this new constructor, could just reuse the other one you added?

}

public HollowListTypeWriteState(HollowListSchema schema, int numShards) {
super(schema, numShards);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: changing to this(schema, numShards, null) would be a bit more readable

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* @param version version that was published.
* @param elapsed duration of the publish stage in {@code unit} units
*/
default void onPublishComplete(HollowWriteStateEngine writeStateEngine, Status status, long version, Duration elapsed) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative would be to add a Stats within the State object, where different stages like publish, etc. could implement their stats like PublishStage could implement stats like duration which could be generic to all stages, or specific ones like ordinap map stats for publish stage. If you think it fits well, otherwise the current approach seems fine too, but if we're doing this then lets also deprecate the older 3 param onPublishComplete, and make this default impl call that one. Could also add to the javadoc- The default implementation delegates to the deprecated 3 param onPublishComplete for backwards compatibility.

Copy link
Collaborator Author

@azeng-netflix azeng-netflix Feb 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made this commit to address the comment, but during the implementation, I think a general Stats would require the Listener side to do type casting like stats instanceof PublishStageStats. would like your opinion on this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add and use PublishListener.onPublishComplete(Status status, long version, Duration elapsed, PublishStageStats stats)

@azeng-netflix azeng-netflix force-pushed the azeng-max-ordinal-check branch 3 times, most recently from 4952801 to 9f0ef02 Compare February 3, 2026 23:46
…poses ordinal map

statistics to PublishListeners for monitoring.
@azeng-netflix azeng-netflix force-pushed the azeng-max-ordinal-check branch from 9f0ef02 to 775b98a Compare February 4, 2026 23:38
@azeng-netflix azeng-netflix merged commit d417781 into master Feb 5, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants